Clustering Binary Data with Bernoulli Mixture Models
Abstract
Clustering is an unsupervised learning technique that seeks “natural” groupings in data. One form of data that has not been widely studied in the context of clustering is binary data. A rich statistical framework for clustering binary data is the Bernoulli mixture model, for which both Bayesian and non-Bayesian approaches exist. This paper reviews the development and application of Bernoulli mixture models to clustering binary data.
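As a rough illustration of the non-Bayesian route mentioned above, a Bernoulli mixture can be fitted by maximum likelihood with the EM algorithm. The sketch below is not taken from the paper; the function name `fit_bernoulli_mixture` and its parameters (`n_components`, `n_iter`, `seed`, `eps`) are hypothetical, and it assumes that features are independent within each cluster.

```python
import numpy as np


def fit_bernoulli_mixture(X, n_components, n_iter=100, seed=0, eps=1e-9):
    """Illustrative EM fit of a mixture of independent Bernoullis.

    X is an (n_samples, n_features) array of 0/1 values.
    Returns mixing weights, per-cluster success probabilities,
    and the soft cluster assignments (responsibilities).
    """
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    n, d = X.shape

    # Assumed initialization: uniform weights, random probabilities
    # kept away from 0 and 1.
    weights = np.full(n_components, 1.0 / n_components)
    probs = rng.uniform(0.25, 0.75, size=(n_components, d))

    for _ in range(n_iter):
        # E-step: log pi_k + sum_j [x_ij log p_kj + (1 - x_ij) log(1 - p_kj)]
        log_p = X @ np.log(probs).T + (1.0 - X) @ np.log1p(-probs).T
        log_p += np.log(weights + eps)
        # Subtract the row maximum before exponentiating for stability.
        log_p -= log_p.max(axis=1, keepdims=True)
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)

        # M-step: responsibility-weighted averages.
        nk = resp.sum(axis=0)
        weights = nk / n
        probs = (resp.T @ X) / (nk[:, None] + eps)
        # Clip so the logarithms in the next E-step stay finite.
        probs = np.clip(probs, eps, 1.0 - eps)

    return weights, probs, resp
```

A hard clustering can then be read off with `resp.argmax(axis=1)`. Bayesian treatments would instead place priors on the weights and probabilities, which this sketch does not attempt.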
Similar resources
Variational Inference for Beta-Bernoulli Dirichlet Process Mixture Models
A commonly used paradigm in diverse application areas is to assume that an observed set of individual binary features is generated from a Bernoulli distribution with probabilities varying according to a Beta distribution. In this paper, we present our nonparametric variational inference algorithm for the Beta-Bernoulli observation model. Our primary focus is clustering discrete binary data usin...
Fuzzy clustering of spatial binary data
An iterative fuzzy clustering method is proposed to partition a set of multivariate binary observation vectors located at neighboring geographic sites. The method applies a recently proposed algorithm, called Neighborhood EM, to the binary setting; this algorithm seeks a partition that is both well clustered in the feature space and spatially regular [2]. This approach is derived from the EM...
Reliable Learning of Bernoulli Mixture Models
In this paper, we have derived a set of sufficient conditions for reliable clustering of data produced by Bernoulli Mixture Models (BMM) when the number of clusters is unknown. A BMM refers to a random binary vector whose components are independent Bernoulli trials with cluster-specific frequencies. The problem of clustering BMM data arises in many real-world applications, most notably...
A parametric mixture model for clustering multivariate binary data
The traditional latent class analysis (LCA) uses a mixture model with binary responses on each subject that are independent conditional on cluster membership. However, in many practical applications, the responses are correlated because they are observed on the same subject; this is known as local dependence. In this paper, we extend the LCA model to allow for local dependence in each cluster t...
Robust Place Recognition within Multi-sensor View Sequences Using Bernoulli Mixture Models
This article reports on the use of Hidden Markov Models to improve the results of Localization within a sequence of Sensor Views. Local image features (SIFT) and multiple types of features from a 2D laser range scan are all converted into binary form and integrated into a single, binary, Feature Incidence Matrix (FIM). To reduce the large dimensionality of the binary data, it is modeled in term...